Content Mining and Network Analysis of Microblog Spam

Authors

  • Shen Yang
  • Li Shuchen
  • Ye Xiaoxiao
  • He Fangping
Abstract

The number of microblog users is growing rapidly, and so is the volume of spam. We first give a formal definition of a microblog and then divide spam into two types: news and advertisements. We collect 1,760,314 microblog news items (188 MB) for content mining. Using ROST Content Mining, we carry out macro-level topology statistics, time series mining, and related analyses. We find that the microblog group exhibits small-world characteristics. Its same-degree coefficient is negative, the probability of news microblog followers is 0.0002, and the rate of second spread is 0.011. We put forward a recursive filtering method to estimate the rate of spread on multiple occasions, and we introduce a cross-relation method that converts nodes that are difficult to handle in network analysis into simpler forms before performing social network analysis.

Similar resources

Detecting Pharmaceutical Spam in Microblog Messages

Microblogs are one of a growing group of social network tools. Twitter is, at present, one of the most popular forums for microblogging in online social networks, and the fastest growing. Fifty million messages flow through servers, computers, and cell phones on a wide variety of topics exchanged daily. With this considerable volume, Twitter is a natural and obvious target for spreading spam vi...

A Suitable Method for Classifying Advertising Emails Based on User Profiles

In general, whether a message counts as spam depends on whether the client is satisfied with it, not on the content of the client's email. Under this definition, problems arise in the field of marketing and advertising: for example, some advertising emails may be spam for some users and not spam for others. To deal with this problem, many researchers design an anti-s...

An Effective Model for SMS Spam Detection Using Content-based Features and Averaged Neural Network

In recent years, there has been considerable interest among people to use short message service (SMS) as one of the essential and straightforward communications services on mobile devices. The increased popularity of this service also increased the number of mobile devices attacks such as SMS spam messages. SMS spam messages constitute a real problem to mobile subscribers; this worries telecomm...

Characterizing SMS spam in a large cellular network via mining victim spam reports

In this paper, a study of SMS messages in a large US-based cellular carrier, utilizing both customer-reported SMS spam and network Call Detail Records (CDRs), is conducted to develop a comprehensive understanding of SMS spam in order to develop strategies and approaches to detect and control SMS spam activity. The analysis provides insights into content classification of spam campaigns as well a...

CLEF 2017 Microblog Cultural Contextualization Content Analysis task Overview

The MC2 CLEF 2017 Content Analysis task deals with classification, filtering, language recognition, localization, entity extraction, linking open data, and summarization. Festivals have a large presence on social media. The resulting microblog stream and related URLs are appropriate to experiment on advanced social media search and mining methods. For content analysis, topics were in any langua...

Journal title:
  • JCIT

Volume 5

Publication date: 2010